Mots visuels pour le calcul de pose (Visual Words for Pose Computation)
Abstract
Estimating the pose (position and orientation) of a camera with respect to a fixed 3D coordinate system from an image of the surrounding environment captured in that pose has many applications. In our work we try to establish 2D-to-3D point correspondences between the image and the environment in order to apply a Perspective-n-Point (PnP) algorithm to compute the pose. Each 2D-to-3D point correspondence associates the 2D image coordinates of an image point with the 3D coordinates of the corresponding point in the environment. In order to establish a 2D-to-3D point correspondence through a 3D point captured in an image, we should know (i) the 3D coordinates of the point and (ii) quantified visual characteristics of the point which can be used to identify its 2D location in a given image. We propose a framework in which we compute the 3D map, i.e. the 3D coordinates and visual characteristics of some of the points in the environment, through an offline training stage using a set of training images of the environment. Given a new test image, we establish the 2D-to-3D point correspondences by using the 3D map to detect some of the 3D points visible in the image. During the training stage we perform SfM (Structure from Motion) on the training images to compute the 3D coordinates of some of the points in the environment. In order to perform SfM we need 2D-tracks of 3D points in the training images. Each 2D-track consists of a set of 2D image coordinates of a single 3D point in different training images. We establish 2D-tracks by clustering the SIFT descriptors computed from the training images. The 2D positions of the SIFT descriptors in a cluster are used to establish a 2D-track. SfM computes the 3D coordinates of the points corresponding to these 2D-tracks by minimizing the reprojection error. During the test stage, the SIFT descriptors associated with the 2D-track of a 3D point are used to recognize the 3D point in a given image.
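The reprojection error that SfM minimizes can be illustrated with a minimal sketch. It assumes an ideal pinhole camera with a single focal length and no lens distortion, which the abstract does not specify; the function names `project` and `reprojection_error` and all numeric values are hypothetical and chosen only for illustration.

```python
import math

# Identity rotation: the camera axes coincide with the world axes (assumed for the example).
IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]


def project(point3d, rotation, translation, focal, principal):
    """Project a 3D world point to pixel coordinates with a pinhole model."""
    # Camera-frame coordinates: X_c = R * X_w + t
    xc = [sum(rotation[i][j] * point3d[j] for j in range(3)) + translation[i]
          for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsic parameters.
    u = focal * xc[0] / xc[2] + principal[0]
    v = focal * xc[1] / xc[2] + principal[1]
    return (u, v)


def reprojection_error(point3d, observation, rotation, translation, focal, principal):
    """Pixel distance between the observed 2D point and the projected 3D point."""
    u, v = project(point3d, rotation, translation, focal, principal)
    return math.hypot(u - observation[0], v - observation[1])


# One 3D point and its observed position in one image of a 2D-track.
error = reprojection_error((1.0, 2.0, 4.0), (523.0, 644.0),
                           IDENTITY, (0.0, 0.0, 0.0), 800.0, (320.0, 240.0))
print(error)
```

In a full SfM pipeline this quantity would be summed over every observation in every 2D-track and minimized jointly over the camera poses and the 3D coordinates; the sketch only evaluates a single term.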
The overall process is similar to the visual word framework used in different fields of computer vision. During training, visual word formation is performed through clustering, and during testing 3D points are identified through visual word recognition. We experiment with different clustering schemes (k-means and mean-shift) and propose a novel scheme for visual word formation. We evaluate different aspects of the quality of the 2D-tracks and the 3D points computed from these schemes during training. During the test stage, we experiment with different matching rules, including some of the popular supervised pattern classification methods, to perform visual word recognition. We evaluate the various recognition strategies based on the accuracy of pose computation under varying degrees of pose difference between training and test images. In order to achieve robustness against pose variation between training and test images, we explore different ways of incorporating SIFT descriptors extracted from synthetic views generated from the training images. While experimenting with mean-shift clustering we conceived a strategy to accelerate its computation by dividing the set of training vectors into groups such that vectors in one group will never influence the computation of clusters in another group. We present the acceleration strategy and a mathematical proof, together with an experimental evaluation.
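One simple matching rule for visual word recognition is nearest-center assignment with a distance threshold. The sketch below is not necessarily one of the rules evaluated in this work; it assumes each visual word is summarized by a single cluster center, and the names `recognize_word` and `max_distance` and the toy 2D vectors (standing in for 128-dimensional SIFT descriptors) are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_word(descriptor, centers, max_distance):
    """Return the index of the nearest visual word, or None if it is too far away."""
    best_index, best_dist = None, float("inf")
    for index, center in enumerate(centers):
        d = euclidean(descriptor, center)
        if d < best_dist:
            best_index, best_dist = index, d
    # Reject descriptors that are not close to any visual word.
    return best_index if best_dist <= max_distance else None


# Two toy visual words; each would correspond to one 3D point of the map.
centers = [(0.0, 0.0), (10.0, 10.0)]
print(recognize_word((1.0, 1.0), centers, 2.0))
print(recognize_word((5.0, 5.0), centers, 2.0))
```

Each accepted match pairs the 2D location of the test descriptor with the 3D point of the recognized word, which gives one 2D-to-3D correspondence for the PnP step. The rejection threshold is a design choice that trades the number of correspondences against the risk of wrong matches.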
Related resources
A review of weighting schemes for bag of visual words image retrieval
Current studies on content-based image retrieval mainly rely on bags of visual words. This model of image description makes it possible to perform image retrieval in the same way as text retrieval: documents are described as vectors of (visual) word frequencies, and documents are matched by computing a distance or similarity measure between the vectors. But instead of raw frequencies, documents can also be d...
Sensor-based control of nonholonomic mobile robots
The problem of tracking a moving target with a nonholonomic mobile robot, by using sensor-based control techniques, is addressed. Two control design methods, relying on the transverse function approach, are proposed. For the first method, sensory signals are used to calculate an estimate of the relative pose of the robot with respect to the target. This estimate is then used for the calculation...
Fusion de ressources hétérogènes pour la recherche d'information multilingue
ABSTRACT. To improve multilingual search in the Sinequa Engine search engine, we integrated the multilingual knowledge of the Sensagent service into the query module of Sinequa Engine. The interface we developed offers query expansion, at the user's choice, by translating the individual words into the selected languages. To limit the ...
The Data-Dependence Graph of Adjoint Programs
Automatic Differentiation is a technique that permits generation of adjoint programs, which compute gradients. In scientific computation, these gradients are a fundamental tool for optimization or data assimilation. Computation of a gradient is relatively expensive, and should therefore be optimized whenever possible. The study of these program optimizations is most often based on the data-depend...
A New Formulation for Non-linear Camera Calibration Using Virtual Visual Servoing Éric Marchand and François Chaumette
This paper presents a new formulation for the non-linear calibration problem. We propose a method based on the well-known visual servoing approach. We consider pose computation and lens calibration as the dual problem of visual servoing. This makes it possible to take advantage of all the research that has been carried out in this domain in the past. The proposed method features accuracy, efficiency, scalabi...
A Two-Stage Robust Statistical Method for Temporal Registration from Features of Various Type
A model registration system capable of tracking an object, the model of which is known, in an image sequence is presented. It integrates tracking, pose determination and updating of the visible features. The heart of our system is the pose computation method, which handles various features (points, lines and free-form curves) in a very robust way. It consists in using robust estimators in a two...
Publication date: 2013